Festparameter-Algorithmen fuer die Konsens-Analyse Genomischer Daten
Fixed-parameter algorithms offer a constructive and powerful approach
to efficiently obtain solutions for NP-hard problems combining two
important goals: Fixed-parameter algorithms compute optimal solutions
within provable time bounds despite the (almost inevitable)
computational intractability of NP-hard problems. The essential idea
is to identify one or more aspects of the input to a problem as the
parameters, and to confine the combinatorial explosion of
computational difficulty to a function of the parameters such that the
costs are polynomial in the non-parameterized part of the input. This
makes sense especially for parameters that take small values in
applications. Fixed-parameter algorithms have become an established
algorithmic tool in a variety of application areas, among them
computational biology where small values for problem parameters are
often observed. A number of design techniques for fixed-parameter
algorithms have been proposed and bounded search trees are one of
them. In computational biology, however, examples of bounded search
tree algorithms have been, so far, rare.
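To make the bounded search tree idea concrete, here is a minimal sketch for Vertex Cover. It is a standard textbook illustration and is not taken from the thesis; the function name `vertex_cover` and the edge-list representation are assumptions made for this example. Each call branches into at most two subproblems and lowers the parameter k by one, so the search tree has at most 2^k leaves and the combinatorial explosion depends only on k.

```python
from typing import List, Optional, Set, Tuple

Edge = Tuple[int, int]


def vertex_cover(edges: List[Edge], k: int) -> Optional[Set[int]]:
    """Return a vertex cover with at most k vertices, or None if none exists.

    Hypothetical helper for illustration only (not from the thesis).
    """
    if not edges:
        return set()      # nothing left to cover
    if k == 0:
        return None       # edges remain, but the budget is used up
    u, v = edges[0]
    # Every cover must contain u or v, so branch on these two choices.
    for chosen in (u, v):
        remaining = [e for e in edges if chosen not in e]
        sub = vertex_cover(remaining, k - 1)
        if sub is not None:
            return sub | {chosen}
    return None


path = [(1, 2), (2, 3), (3, 4)]
print(vertex_cover(path, 1))  # None: one vertex cannot cover the path
print(vertex_cover(path, 2))  # a cover with two vertices
```

The recursion depth is at most k and each node does work linear in the number of edges, which gives a running time of roughly O(2^k · m) for m edges: exponential in the parameter only, polynomial in the rest of the input.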
This thesis investigates the use of bounded search tree algorithms for
consensus problems in the analysis of DNA and RNA data. More
precisely, we investigate consensus problems in the contexts of
sequence analysis, of quartet methods for phylogenetic reconstruction,
of gene order analysis, and of RNA secondary structure comparison. In
all cases, we present new efficient algorithms that incorporate the
bounded search tree paradigm in novel ways. Along the way, we also obtain
results of parameterized hardness, showing that the respective
problems are unlikely to allow for a fixed-parameter algorithm, and we
introduce integer linear programs (ILPs) as a tool for classifying
problems as fixed-parameter tractable, i.e., as having fixed-parameter
algorithms. Most of our algorithms were implemented and tested on
practical data.
Fixed-parameter algorithms offer a constructive approach to solving
combinatorially hard, usually NP-hard, problems that serves two
goals: optimal results are computed within provable running-time
bounds. The decisive idea is to treat one or more aspects of the
problem input as parameters of the problem and to confine the
combinatorial explosion of algorithmic difficulty to these
parameters, so that the running-time costs are polynomial in the
non-parameterized part of the input. If a fixed-parameter algorithm
exists for a combinatorial problem, the problem is called
fixed-parameter tractable. Developing fixed-parameter algorithms
makes sense above all when the parameters considered take only small
values in applications. Fixed-parameter algorithms have become a
standard algorithmic tool in many application areas, among them
computational biology, where small parameter values can be observed
in many applications. The known techniques for designing
fixed-parameter algorithms include bounded search trees. In
computational biology there have so far been only few examples of
the use of bounded search trees.
This thesis investigates the use of bounded search trees for NP-hard
consensus problems in the analysis of DNA and RNA data. We consider
consensus problems in the analysis of DNA sequence data, in the
analysis of so-called quartet data for building phylogenetic
hypotheses, in the analysis of data on the order of genes, and in
the comparison of RNA structure data. In all cases we present new
efficient algorithms in which the bounded search tree paradigm is
realized in novel ways. Along the way we also obtain parameterized
hardness results, which show that a fixed-parameter algorithm is
unlikely for the problems considered. In addition, we introduce
integer linear programming as a new technique for showing that a
problem is fixed-parameter tractable. Most of the algorithms
presented here were implemented and tested on application data.
Worst-case upper bounds for MAX-2-SAT with an application to MAX-CUT
Abstract: The maximum 2-satisfiability problem (MAX-2-SAT) is: given a Boolean formula in 2-CNF, find a truth assignment that satisfies the maximum possible number of its clauses. MAX-2-SAT is MAX-SNP-complete. Recently, this problem received much attention in the contexts of (polynomial-time) approximation algorithms and (exponential-time) exact algorithms. In this paper, we present an exact algorithm solving MAX-2-SAT in time $\mathrm{poly}(L)\cdot 2^{K/5}$, where $K$ is the number of clauses and $L$ is their total length. In fact, the running time is only $\mathrm{poly}(L)\cdot 2^{K_2/5}$, where $K_2$ is the number of clauses containing two literals. This bound implies the bound $\mathrm{poly}(L)\cdot 2^{L/10}$. Our results significantly improve previous bounds: $\mathrm{poly}(L)\cdot 2^{K/2.88}$ (J. Algorithms 36 (2000) 62–88) and $\mathrm{poly}(L)\cdot 2^{K/3.44}$ (implicit in Bansal and Raman (Proceedings of the 10th Annual Conference on Algorithms and Computation, ISAAC’99, Lecture Notes in Computer Science, Vol. 1741, Springer, Berlin, 1999, pp. 247–258)). As an application, we derive upper bounds for the (MAX-SNP-complete) maximum cut problem (MAX-CUT), showing that it can be solved in time $\mathrm{poly}(M)\cdot 2^{M/3}$, where $M$ is the number of edges in the graph. This is of special interest for graphs with low vertex degree.
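The problem statement above can be pinned down with a small brute-force reference. This is only a sketch of the problem definition, not the algorithm of the paper: it enumerates all assignments and therefore runs in time exponential in the number of variables rather than within the clause-based bounds quoted in the abstract. The literal encoding (`+i` for x_i, `-i` for its negation) and the function names are assumptions made for this example.

```python
from itertools import product
from typing import Dict, List, Tuple

Clause = Tuple[int, ...]  # assumed encoding: +i means x_i, -i means NOT x_i


def satisfied(clause: Clause, assignment: Dict[int, bool]) -> bool:
    """A clause is satisfied if at least one of its literals is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def max_2_sat_brute_force(clauses: List[Clause], num_vars: int) -> int:
    """Return the maximum number of simultaneously satisfiable clauses.

    Illustrative baseline only: tries all 2**num_vars assignments.
    """
    best = 0
    for values in product([False, True], repeat=num_vars):
        assignment = dict(enumerate(values, start=1))
        best = max(best, sum(satisfied(c, assignment) for c in clauses))
    return best


# (x1 or x2), (not x1 or x2), (x1 or not x2), (not x1 or not x2)
clauses = [(1, 2), (-1, 2), (1, -2), (-1, -2)]
print(max_2_sat_brute_force(clauses, 2))  # 3
```

In this instance K = 4 clauses with total length L = 8; every assignment falsifies exactly one clause, so the optimum is 3.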
arXiv:cs.CC/0205056 v1 21 May 2002
Abstract: We show that Closest Substring, one of the most important problems in the field of biological sequence analysis, is W[1]-hard when parameterized by the number k of input strings (and remains so, even over a binary alphabet). This problem is therefore unlikely to be solvable in time $O(f(k) \cdot n^c)$ for any function f of k and constant c independent of k. The problem can therefore be expected to be intractable, in any practical sense, for k ≥ 3. Our result supports the intuition that Closest Substring is computationally much harder than the special case of Closest String, although both problems are NP-complete. We also prove W[1]-hardness for other parameterizations in the case of unbounded alphabet size. Our W[1]-hardness result for Closest Substring generalizes to Consensus Patterns, a problem of similar significance in computational biology.
Introduction. Motif search problems are of central importance for sequence analysis in computational molecular biology. These problems have applications in fields such as genetic drug target identification or signal finding (see
What is currently known about these two problems is summarized as follows.
The Closest Substring Problem.
1. Closest Substring is NP-complete, and remains so for the special case of the Closest String problem, where the string s that we search for is of the same length as the input strings. Closest String is NP-complete even for the further restriction to a binary alphabe
Swiftly Computing Center Strings
Hufsky F, Kuchenbecker L, Jahn K, Stoye J, Böcker S. Swiftly Computing Center Strings. BMC Bioinformatics. 2011;12(1): 106
A fixed-parameter algorithm for Minimum Quartet Inconsistency
Given n taxa, exactly one topology for every subset of four taxa, and a positive integer k (the parameter), the Minimum Quartet Inconsistency (MQI) problem is the question whether we can find an evolutionary tree inducing a set of quartet topologies that differs from the given set in only k quartet topologies. The more general problem where we are not necessarily given a topology for every subset of four taxa appears to be fixed-parameter intractable. For MQI, however, which is also NP-complete, we can compute the required tree in time $O(4^k \cdot n + n^4)$. This means that the problem is fixed-parameter tractable and that in the case of a small number k of “errors” the tree reconstruction can be done efficiently. In particular, for minimal k, our algorithm can produce all solutions that resolve k errors. Additionally, we discuss significant heuristic improvements. Experiments underline the practical relevance of our solutions. 〈Keywords: computational biology, minimum quartet inconsistency, parameterized complexity, phylogeny, quartet methods〉
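The quantity that MQI bounds by k can be computed directly for a given candidate tree. The following sketch is not the fixed-parameter algorithm of the paper; it only evaluates one tree. It assumes an unrooted binary tree stored as adjacency lists and uses the four-point condition on path lengths to read off the quartet topology the tree induces. All names (`induced_quartets`, `inconsistency`, `topo`) and the example tree are hypothetical.

```python
from collections import deque
from itertools import combinations
from typing import Dict, FrozenSet, List, Set

Tree = Dict[str, List[str]]           # adjacency lists of an unrooted tree
Topology = FrozenSet[FrozenSet[str]]  # {{a, b}, {c, d}} stands for ab|cd


def topo(left: str, right: str) -> Topology:
    """Build the topology left|right, e.g. topo("ab", "cd") for ab|cd."""
    return frozenset([frozenset(left), frozenset(right)])


def distances_from(tree: Tree, source: str) -> Dict[str, int]:
    """Breadth-first search: number of edges from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in tree[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def induced_quartets(tree: Tree, taxa: List[str]) -> Set[Topology]:
    """Return the topology induced by the tree for every four taxa."""
    dist = {t: distances_from(tree, t) for t in taxa}
    result = set()
    for a, b, c, d in combinations(sorted(taxa), 4):
        splits = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
        # Four-point condition: in a binary tree the induced split is the
        # pairing whose two path lengths have the smallest sum.
        (p, q), (r, s) = min(
            splits, key=lambda sp: dist[sp[0][0]][sp[0][1]] + dist[sp[1][0]][sp[1][1]]
        )
        result.add(frozenset([frozenset([p, q]), frozenset([r, s])]))
    return result


def inconsistency(tree: Tree, taxa: List[str], given: Set[Topology]) -> int:
    """Count given quartet topologies that the tree does not induce."""
    return len(given - induced_quartets(tree, taxa))


# Caterpillar tree on taxa a..e with internal nodes u, v, w (made-up example).
tree = {
    "a": ["u"], "b": ["u"], "c": ["v"], "d": ["w"], "e": ["w"],
    "u": ["a", "b", "v"], "v": ["u", "c", "w"], "w": ["v", "d", "e"],
}
taxa = ["a", "b", "c", "d", "e"]
given = {topo("ac", "bd"), topo("ab", "ce"), topo("ab", "de"),
         topo("ac", "de"), topo("bc", "de")}
print(inconsistency(tree, taxa, given))  # 1: the tree induces ab|cd, not ac|bd
```

Because the input contains exactly one topology per quartet, the size of the set difference equals the number of quartets on which the tree and the input disagree.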
Evaluating an algorithm for parameterized Minimum Quartet Inconsistency
Introduction. We experimentally evaluate an algorithm solving the problem to reconstruct a binary (evolutionary) tree from a complete set of quartet topologies in case of a limited number of errors:
MINIMUM QUARTET INCONSISTENCY (MQI)
Input: A set S of n taxa and a set Q of quartet topologies such that there is exactly one topology for every quartet set corresponding to S, and an integer k.
Question: Is there an evolutionary tree T where the leaves are bijectively labeled by the elements from S such that the set of quartet topologies induced by T differs from Q in at most k quartet topologies?
[Figure: example quartet topologies on the taxa a, b, c, d, e, f, with k = 2]
Our results. In [5], we give an algorithm computing the required tree in time $O(4^k \cdot n + n^4)$, thus showing that MQI is fixed-parameter tractable. Here, we present some experiments on artificially generated as well as on biological data taken from
of the Eberhard-Karls-Universität Tübingen for the award of the degree of Doctor of Natural Sciences (Dr. rer. nat.)
written and to have used only the cited sources. i Zusammenfassun
Faster Exact Algorithms for Hard Problems: A Parameterized Point of View
Recent times have seen quite some progress in the development of "efficient" exponential-time algorithms for NP-hard problems. These results are also tightly related to the so-called theory of fixed-parameter tractability. In this incomplete, personally biased survey, we reflect on some recent developments and prospects in the field of fixed-parameter algorithms.